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Keeping it Real: Revisiting a Real-Space Approach to Running 
Ensembles of Cosmological N-body Simulations 
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In setting up initial conditions for cosmological N-body simulations there are, fundamentally, two 
choices: either maximizing the correspondence of the initial density field to the assumed fourier-space 
clustering or, instead, matching to the real-space clustering. As a stringent test of both approaches, 
I perform ensembles of simulations using power law models and exploit the self-similarity of these 
initial conditions to quantify the accuracy of the results. Originally proposed by Pen 1997 and 
implemented by Sirko 2005, I show that the real-space motivated approach, which allows the DC 
mode to vary, performs well in exhibiting the expected self-similar behavior in the mean ^(r) and 
P{k) and in both methods this behavior extends below the scale of the initial mean interparticle 
spacing. I also test the real-space method with simulations of a simplified, powerlaw model for 
baryon acoustic oscillations, again with success, and mindful of the need to generate mock catalogs 
using simulations I show extensive (powerlaw) tests for the halo mass function and halo bias in 
my simulations. Although requiring a few to many times more simulations than the standard, 
fourier-space method to reach a given certainty on the correlation function or power spectrum, 
I find that the real-space method is more reliable for modeling P{k) when the clustering level 
becomes significant on the scale of the simulation box. However, even for the niche application of 



simulating extremely red power spectra (n^ 



-2) or small box ACDM simulations the ^-sampled 



approach in its original form is not especially well-suited for this regime. I conclude by discussing 
some possibilities for optimizing or hybridizing the real-space method for more general use and an 
appendix demonstrates the potential for using perturbation theory to model the effect of the box 
scale on the simulated growth of structure in conventional P-sampled simulations. 
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I. INTRODUCTION 

Next-generation astronomical surveys will demand in- 
creasingly precise predictions from theory in order to 
properly interpret observations and constrain the nature 
of dark energy. As emphasized by p|, this will be a 
challenging task; inaccuracies in the predictions of halo 
abundance and halo bias, for example, can affect cosmo- 
logical inferences ^2], and measurements of the baryon 
acoustic oscillations (BAO) clustering feature will even- 
tually reach the stage where theoretical estimates of the 
shift of this feature from non-linear dynamics become 
important Although current state-of-the-art cosmo- 
logical N-body simulations, given a specific set of cos- 
mological parameters, are in many ways well-equipped 
to deliver highly precise predictions of the dark matter 
two-point correlation function and power spectrum for a 
relatively wide range of scales [3] , the difhcult-to-estimate 
covariances of these statistics are also crucial for placing 
constraints on cosmological parameters @|- To stay 
ahead of the observational demands, any methodologi- 
cal improvement to the current framework of setting up 
and running cosmological N-body simulations is therefore 
welcome, even if only relevant to certain applications and 
in certain regimes. 

In this paper I explore the predictions of a method for 



running ensembles of simulations that, unlike the con- 
ventional method, is designed to maximize correspon- 
dence between simulated real-space clustering statistics 
(e.g. cr8,C(^)) and the real-space properties of the as- 
sumed cosmological model. Originally proposed by Pen 
and implemented by Sirko Qp, this method allows the 
DC mode of each simulation (in an ensemble of simula- 
tions) to vary according to the clustering power on the 
scale of the box in much the same way that the density 
within randomly placed boxes in the real universe will 
fluctuate around the mean density. Although it is now 
common practice, using the conventional method, to set 
the DC mode in each simulation to zero, in the early days 
of fully cosmological N-body simulations this was not al- 
ways done [e.g. Q. This issue has also been discussed in 
the context of artificially changing the DC mode of an 
existing simulation as a way of scaling a simulation com- 
pleted with a certain set of cosmological parameters to a 
slightly different model [iMl- 

In the Sirko Q framework the initial power spectrum 
used with the Zeldovich [TJ] (and by extension 2LPT 
[l^[ll]) approximation is convolved such that the mat- 
ter correlation function matches exactly the linear theory 
correlation function for r < Lbox/2, while for r > Lbox/2 
the correlation function is set to zero. With this in mind 
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^ Once publicly available, the code can still be obtained through 
http: / /web. archive. org 
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Sirko refers to this approach as "^-sampled" initial con- 
ditions (ICs) , while the standard method is referred to as 
"P-sampled" , since by using an unconvolved linear the- 
ory power spectrum with the Zeldovich approximation 
the initial conditions are instead matched to the fourier 
space clustering statistics. The ^-sampled strategy, by 
matching the correlation function out to r = Lbox/2, 
should avoid biases on all real space statistics, since the 
rms overdensity in spheres, cr(i?), is simply related to the 
correlation function, and the halo mass function to good 
approximation is only a function of a{R). Without this 
convolution these real space statistics become biased (i.e. 
from P{k) = for fc < 27r/Lbox), as discussed by [7\ and 

Although a number of groups have published results 
using the initial conditions code developed by Sirko, 
which was the among the first include the 2nd order La- 
grangian corrections 

[3 to the Zeldovich [3 dis- 
placements, the code is very seldom used to generate £- 
sampled ICs. To my knowledge, only Reid et al. [131 
have utilized the code in this mode, citing the success 
of convergence tests in [l^. In that study they cre- 
ate mock catalogues from a suite of 42 simulations with 
Lhox = 558h~^ Mpc, and N = 512^ for comparison 
with SDSS LRG data 0. They chose the ^-sampled 
method for this task, citing the attractive feature of al- 
lowing the DC mode of the box to vary, thereby modeling 
the power spectrum covariance of real surveys more re- 
alistically. [13 and Appendix A of [ISj present a wide 
variety of convergence tests that explore the effects of 
increasing the resolution with either fixed initial condi- 
tions (i.e. with a particular randomly sampled value for 
the DC mode) or for a set of a few initial conditions 
realizations. More recently, [2l| argued that the DC 
mode should be re-introduced and compared the results 
of five -Lbox = 20h~^ Mpc, ACDM simulations using the 
^-sampled method to a high-resolution, standard-method 
simulation with Lbox — SOh~^ Mpc, finding good corre- 
spondence between the results for the variance of the halo 
mass function. 

This study systematically explores the predictions of 
the two different methods using relatively large ensem- 
bles of simulations (> 50) and a diverse set of initial 
conditions. Where the results disagree it may be am- 
biguous which approach is more accurate, therefore I fo- 
cus on pure powerlaw models which should evolve self- 
similarly. This allows powerful self-consistency checks 
of the simulation results, since each output should, in a 
statistical sense, resemble scaled versions of earlier and 
later outputs. Fig. [T] shows an illustration of this, com- 
paring snapshots from cosmological iV-body simulations 
(initially n — —1 power spectrum) at three different times 
in the left panels and on the right showing a zoomed- 
in view of the previous output. Each image is a slice 
through 1 / 8th of the simulation volume and in each frame 
particles are randomly sampled from the simulation so 
that each side-by-side panel compares the same number 
of particles. Importantly, the non-linear scale, which 



grows with time, is shown on each plot. The zoomed- in 
plots on the right have the same value of Rni , relative to 
the axes shown, as the simulation outputs immediately 
to the lefiI3. Qualitatively, for example in the sizes of 
voids and large structures, the zoomed-in slices on the 
right bear a remarkable resemblance to the simulation 
outputs shown adjacently to the left0. More rigorously, 
one can quantify the differences between the two images 
in a statistical way and in the limit of many realizations 
the scaled-forward density field should be statistically 
equivalent to the density field at the later output. To 
the extent that this happens the simulation can be said 
to evolve with self-similarity and it is this property of 
power-law initial conditions that I exploit throughout to 
evaluate the accuracy of both the conventional and 
sampled methods. 

These kinds of "self-similar" tests were decisive in con- 
firming the accuracy of the first generation of fully cosmo- 
logical N-body codes 22|. I also show a few tests where, 
instead of a pure powerlaw, I simulate initial conditions 
consistent with a configuration space powerlaw times a 
gaussian bump. Investigated in (23j in great depth as 
a simplified model of BAO, this test is self-similar in a 
different sense - namely that the evolution of the dark 
matter clustering should only depend on the ratio of the 
scale of non-linearity to the scale of the BAO. I include 
these initial conditions as another test of the ^-sampled 
method and as a valuable cross-check for the conventional 
method's predictions for the shift of the BAO feature. 

I test these models extensively, focusing on pure pow- 
erlaw models with spectral slopes of n = —1, —1.5, and 
—2, and on the three models explored in [23| which resem- 
ble n — —0.5, —1, and —1.5 powerlaws in fourier space. 
I compare predictions from the two methods, showing 
results for the matter correlation function in § IIIIl the 
power spectrum in § IIVI and statistics of the halo popu- 
lation in § [V] In § I Villi I discuss my results and in § HXl 
I summarize my main conclusions. Two appendicies dis- 
cuss the effects of the integral constraint measurement 
bias on the correlation function and halo clustering bias, 
and another appendix compares results presented in § IIVI 



^ In Fig.[T] is defined as the scale where the rms overdensity in 
randomly placed spheres of radius, R, reaches the value of one. 
Different definitions for the non-linear scale are used elsewhere in 
the paper. This convention was chosen merely for visual appear- 
ance since, although each non-linear scale grows with the same 
rate, at a given epoch R„i is a larger value than other choices. 

^ If unconvinced of this, look at the zoomod-in density fields on 
the right and try to tell if this density distribution is a fiipped 
or mirrored version of the plot immediately to the left. If it is 
hard to tell on first glance this is evidence that the two images 
are statistically similar. Note that, formally, this is not possible 
with ACDM initial conditions but, largely because the ACDM 
power spectrum is roughly a powerlaw in simulations using boxes 
smaller than a few hundred Mpc, the eye could fairly easily be 
tricked if an analogous plot were made. Fig [T] is only intended 
as a pictorial illustration of the self-similarity of a cosmological 
density field. 



FIG. 1. Left panels Slices from an N-body simulation showing particles in the lower l/8th of the simulation volume. The axes 
extend out to the box scale and the plots appear in chronological order - the earliest output at the top and the last output at 
the bottom. The nonlinear scale, Rni, is shown in each plot. Right panels Slices from the lower-right quadrant of the preceding 
output. The non-linear scale is shown on each plot, indicating that the slices have the same level of clustering as the outputs 
shown to the left. 
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to a perturbation theory model for the suppression of the 
growth of structure caused by the finite size of the simu- 
lation box. 



II. INITIAL CONDITIONS 
A. Overview of the ^-sampled Method 

In the ^-sampled method implemented by [sj, the (real 
space) matter correlation function for a given cosmologi- 
cal model is the fourier transform of the power spectrum 



(Pk 

(2^ 



1 

2^2 



P{k) 



sin kr 
kr 



■ k^ dk. 
(1) 

To convolve P{k) such that the simulated ^(r) is an exact 
match to Eq.[l]for r < Lbox/2, but is zero for larger sepa- 
rations, one simply fourier transforms ^(r) while cutting 
off the integral at Lbox/2 since £_{r) — for r > Lbox/2, 



k/2 



sin kr 
kr 



dr. 



(2) 



I will refer to this result as ProaK^) to emphasize that this 
power spectrum is designed to maintain correspondence 
with the real space properties of the cosmological den- 
sity field. Importantly, Prcai(O) can be non-zero even if 
P(0) = 0; this term sets the fluctuations in the DC mode. 
In Appendix A of [s^ , using the subscript "uni" to denote 
variables in the model of interest and "box" to identify 
the parameters of the simulated volume, these fluctua- 
tions are mapped self-consistently onto fluctuations in 
cosmological parameters. 



Ho,] 



Ho,, 



m.box 



^^A,box — ^^A,uni(l ^ 
, 5 ilr^ 



6D(1) 
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(4) 
(5) 

(6) 



where Aq is a gaussian variable with mean zero and vari- 
ance Preai(0)/L^Q^ and D{1) is the value of the linear 
growth function at the present epoch. Note that Eq. [3] 
implies that in length units the box size of each sim- 
ulation varies with the value of </>, whereas in standard 
length units (e.g. Mpc) the box size remains fixed. Sim- 
ilarly the box integrated mass, Mbox = Pm-^box' varies 
from box-to-box in /i~^Mq units, but is fixed in Mq 
units. 

Of crucial importance in deriving Eqs. is the re- 
lationship between the scale factor of interest, auni, and 
the corresponding scale factor in a particular realization, 
Obox- In |8] this relationship is set by an approximate 
formula which determines abox as the epoch where the 
age of the universe in the box is the same as the age of 



the unperturbed universe during the epoch of interest^ 

(7) 



Obo 



[8j justified this formula by arguing that the ratio of the 
average density of the universe to the average density of 
a given box, Puni/pbox = aLx/^fmi' is simply related to 
the overdensity of the box, which grows according to the 
linear theory growth function. Eq.[7]can also be obtained 
by Taylor expanding the perturbed i?(abox) for small (f> 
and equating the age of the universe in the box to the 
age of the universe at the epoch of interest. 



B. Integration of Particle Trajectories 

Having set up the initial conditions, determined the 
perturbed cosmological parameters of a given realization 
and computed the relevant scale factors, abox, tor the 
epochs of interest, the initial conditions can be evolved 
using any cosmological N-body code. I use the publicly- 
available Gadget2 code with no modifications [24|. As 
a hybrid Tree-based code with PM grid for large scale 
forces, Gadget2 is a highly scalable N-body code which 
compares well to other codes used in the literature [e.g.|3l ■ 
Unless otherwise noted I show results from simulations 
with 256'^ particles and a 512^ PM grid. Initial redshifts 
were set using A'^{k-^y) < 0.001 as a rule of thumb [25j . 
and the force softening was set to l/20th the initial mean 
interparticle spacing. 



C. Statistics of the Cosmological Density Field 

With ensembles of simulations in the conventional 
method, the measurements of dark matter clustering at 
a given output, Ouni, can typically be combined, and the 
statistical precision improved, with a simple average. In 
^-sampled simulations this procedure involves the extra 
step of weighting the measurements in each realization 
by factors of aj^^^. Measuring ^(r) (in any context) is 



equivalent to calculating [26 



r(r) 



(8) 



where h is the average number density of particles in 
the volume and r(r) is the number density of particles 
separated by length r. For an output from given real- 
ization using ^-sampled ICs I am interested in ^uni('') = 



was the first to appreciate that a^ox 7^ i^uni but instead 
proposed to set at,ox by matching the amphtude of the linear 
growth function in the perturbed cosmology. I return to this 
issue in lVIIII 
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r (r) / TT-uni — 1 , and since 
ing is simply 



?T-box(abox/auni)^ this scal- 



box 



(r) 



abo 



^1=1 — 1 (CboxM + 1)-1 

(9) 

Note that in Eq. |9] I have omitted an extra factor of 
— 2(auni/abox)'^ that appears in Q. I am unable to repro- 
duce the results in [8| including this terni, which could 
arise from using optimal estimators like or [2^. I 
use the "natural" estimator, ^(r) = "EiDDi/ RRi — 1, with 
RRi derived analytically in the conventional way by mul- 
tiplying the mean number density of particle pairs by the 
volume of the spherical shell. Q does not state whether 
this or some more sophisticated estimator is used. 

Another important subtlety to correlation function es- 
timation worth mentioning is that for P-sampled simu- 
lations, an integral constraint is is artificially imposed on 
the scale of the box (e.g. [1^ Appendix B]). It was no- 
ticed in re-producing Fig. 9 of [Ml, that the P-sampled 
correlation function results presented there were com- 
puted without proper correction for this measurement 
bias. As discussed in Appendix \^ this negates part of 
the claimed advantage of the ^-sampled method, which 
had agreed much better with the expected linear theory 
result on scales approaching the box. 

Note that the integral constraint ~ which arises from 
assuming incorrectly that the average number density of 
pairs in the volume i^ox equal to the cosmic value 
- does not apply to ^-sampled simulations because the 
mean density varies self-consistently from box to box. I 
will return to this issue in § IVI[ pointing out the close 
resemblence of the integral-constraint correction fEg. lASp 
to the (obox/auni)^ factor in Eq. [HI 

The box-to-uni correction factor for the power spec- 
trum (in ^-sampled simulations) is determined straight- 
forwardly from the fourier transform of Eq. |9l 



Puni(fc) 



flbc 



{6{k)S*(k)) . 



(10) 



If computed using h Mpc^^ units in fc-space, the mea- 
surement of the bracketed term in Eq. [TOl and the book 
keeping required to count of the number of modes con- 
tributing to each k for error estimation can be a fairly 
complicated exercise since the size of the simulation box 
varies in length units. In practice, it is much sim- 
pler (bearing in mind that box size is fixed in Mpc units) 
to convert the particle positions to Mpc units before ap- 
plying the standard methods of mapping the particles 
onto a gridded density field 29., I use CIC] and comput- 
ing its fourier transform. Similarly, measuring the halo 
mass function is considerably simplified by converting the 
particle positions to Mpc units and the particle masses 
to Mq units (from h~^MQ units) before using standard 
halo finding methods. I use the FOE halo finder [s^, Hfj 
with a linking length of 0.2 times the mean interparticle 
spacing. 



D. ^-sampled ICs with Powerlaw Models 

For powerlaw models, where P{k) = Aa^k", the task 
of computing Eq. [3] is made substantially easier because 
an exact analytic solution for ^(r) is known in this case. 



2tt^ rg+^ (2 + n) 
r{3 + n) sin((2-f n)7r/2) 



[3^ . Eq. [2] therefore becomes 



Prcal {k) 



k/2 



-(n+1) 

kr 



(11) 



(12) 



and the DC power can be straightforwardly expressed as 

ribox/2 



^real(O) =4^rj; 



.n+3 



^-(n+l) dr (13) 



2"+^7r / ro 



n+3 



(14) 



Analytic and special-function solutions to Eq. [T^] exist 
for certain powerlaws. In this study I am interested in 
n ~ —1, —1.5 and —2 which can be expressed by 

Prcal,n=-l(fc) = 47rr2 Si(kLbox/2) k"! , (15) 

Pcal.n=-1.5(A;) =2^/2^3/2^3/2 S(VkW/0F)k-l-5(16) 

(fc) = 8^ro sin2(fcLbox/4) k'^, (17) 

where Si(x) is the sine integral and S(x) is a Fresnel in- 
tegral. These formulae can be very useful for generat- 
ing accurate initial conditions, especially for steep power 
spectra. I show these power spectra in Fig. [21 fixing 
ro/Lhox — 1/16 to set the relative amplitudes. Notice 
that steeper powerlaws have larger DC power, easily seen 
on the plot as the asymptotic value of Proai(^)/ibox 
small k. Noticing that P{k) does not go to zero at small 
k for the P-sampled powerlaws, one might be concerned 
that these models are unphysical. However, despite the 
high levels of large scale clustering power the rms over- 
density in spheres and other statistics remain finite for 
n > -3. 



E. Scale free? 

Although pure powerlaw models are often referred to 
in the literature as "scale free," since P{k) — Ak"^ is 
featureless, the ^-sampled initial power spectra shown in 
Fig. [5] clearly depend on the choice of Lbox- In practice, 
these oscillatory features die away in simulations and the 
effect of Lbox is merely to change the variance of the DC 
mode (which is set by Prcai(0)/ibox). 

Since dark energy does introduce a new scale into the 
problem (e.g. the age of the universe when pm = Ph), I 
consider only ilm.uni = l-0,r2A,uni = 0, ri^.uni = so as 
to keep the simulations as "scale free" as possible and al- 
low the self-similar tests discussed in the next section. In 
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FIG. 2. A comparison of P-sampled and ^-sampled pure 
powerlaw models, ^-sampled power spectra are computed 
from Eq. [2] and used to generate initial conditions for that 
method. ro/Lbox ~ 1/16 is chosen to set the overall ampli- 
tude of each model. 



the Zeldovich 14] and adhesion [33|, |3J] approximations 
(as in linear theory), the effect of dark energy on struc- 
ture formation is entirely captured by changing the linear 
theory growth function. [35 1 and [36| convincingly argue 
that this approximation is remarkably accurate even in 
the non-linear regime - the second order effect of dark 
energy is relatively small (see also Chapter 2). Therefore 
the results of my = 1 tests should still be quite rele- 
vant to studies that include a dark energy component. 

As one final comment on the scale-free nature of my 
simulations, throughout I adopt, as a time variable, 



a 
a* 



box 



NL 



(n+3)/2 



(18) 



where fcbox = 27r/Lbox and /cnl is defined by the di- 
mensionless linear theory power spectrum, A^(fcNL) = 1- 
This choice is simply related to the tTmiss formula of [s^l , 
which quantifies the missing power on the scale of the 
box in P-sampled simulations. I adopt Eq. [TH] for ease 
of comparison with (ssj and because the (Jmiss formula in 
[s?! would be inappropriately applied to ^-sampled simu- 
lations, which have a turnoff in Phik) near the box scale 

(Fig. [a. 



III. £,{r-) RESULTS 

A. Powerlaw Models 

Fig. [3] shows my primary results for the self-similar 
scaling of the matter correlation function. The x-axis is 
shown in r/rg units where £,l{'>'o) = 1- Insofar as the 
dark matter clustering is negligibly affected by numeri- 
cal limitations such as the finite scale of the box or the 



scale of the force softening, with this scaling the correla- 
tion function results from different outputs should all lie 
upon the same line. To the extent that this is achieved 
the correlation function can be said to evolve with self- 
similarity and it is clear from Fig. |3l excluding the first 
outputs which are severely affected by transients from 
initial conditions, that over a wide range of scales the re- 
sults from these relatively modest, N = 256^, simulations 
do fall upon the the same locus as expected. This locus 
is different for each powerlaw; for steeper power spectra 
(e.g. n — —2) power is transferred from large scales to 
small scales and the no n- linear growth of ^(r) out paces 
linear theory whereas for shallower power spectra (e.g. 
n > — 1) there is so much small scale power that the pro- 
cess of halo formation and collapse causes the non-linear 
growth to fall behind linear theory in a process some- 
times called "pre-virialization" [28]. In the language of 
the halo model ^] this implies that the predicted lin- 
ear theory clustering on small scales is so high that the 
amplitude of the 1-halo term is below the linear theory 
clustering amplitude on those scales. The n = —1 case 
falls between these two extremes and the amplitude of the 
correlation function is both above and below linear the- 
ory, depending on the regime. (For a bracketing case of 
an even shallower power spectrum see, e.g., the n = —0.5 
results in [H, Appendix A].) 

In Fig. [31 the ^-sampled and P-sampled methods gen- 
erally agree on the shape of the self-similar solution. This 
is significant for the ^-sampled results, on some level ver- 
ifying the method. Alongside the measurements in each 
case the results from higher resolution simulations are 
shown. For n = — 1 and n = — 2 this comparison is made 
by numerically fourier transforming the non-linear power 
spectrum fitting functions published in [s^; note in the 
n = —1 case I include subtle but important corrections 
to their fit at small k / k^i as determined in (23l Appendix 
A]. For n = —1.5, I compare with ^(r) measurements 
from 10 P-sampled simulations with N — 512'^ 23, Ap- 
pendix A]. These high- resolution results are used more 
quantitatively in Fig. 21 

Important to point out in Fig. O is that the error on 
the mean ^(r) for the ^-sampled method is much larger 
than the P-sampled simulations, despite running 2.5 x as 
many ^-sampled simulations in the n — ~1 and n — —2 
cases to improve the statistics, and 7.5 x as many simu- 
lations in the n = —1.5 case to further ensure that the 
mean of the distribution of DC modes in the simulation 
set was very close to zero. For the ^-sampled n = —1 case 
there are not enough realizations (50) to ensure that the 
mean of the distribution of DC modes agree with zero at 
the necessary level of accuracy (a criterion which will be 
discussed in ? IVIII) . and as a result, although difficult to 
see from the large error bars on the mean, there are strong 
deviations between the measurements of the mean ^(r) 
and the self-similar fit at large r for each output. The ef- 
fect is more clearly illustrated in Fig. |31 which shows the 
agreement between the ^ (r) measurements from my sim- 
ulation set and the high-resolution self-similar solutions 
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FIG. 3. Measured matter autocorrelation functions from conventional P-sampled (upper panels) and ^-sampled (lower panels) 
ensembles of simulations. The left-most panels show results from an initially n = — 1 power spectrum, middle panels show 
results from n = —1.5, and the right hand panels show n = —2. In each plot the a;-axis is scaled by the non-linear scale, 
ro, where ^^(ro) = 1 so that, if evolving with the expected self-similar behavior, the outputs should lie upon the same locus 
of points. The y-axis is scaled by ^L{r) = (ro/r)""*"^. Black lines show high resolution results from P-sampled simulations 
for comparison (see text). Error bars show measured error on the mean. Note that the first outputs in each plot are severely 
affected by transients from imtial conditions. 



more quantitatively. Near r/Lbox ~ 10~^ and indepen- 
dent of output, the ^-sampled n = —1 results systemati- 
cally fall well below the self-similar solution. 

With this well- understood caveat, the agreement with 
the high-resolution self-similar results is fairly good and 
excluding the initial and final outputs in each case my 
simulation set tends to match the self-similar evolution to 
better than about 5% in most outputs and on most scales. 
The last output is excluded from this conclusion since the 
linear theory clustering level is so high that one expects 
departures from the true non-linear clustering from the 
suppression of power on the scale of the box. Also, the 
correction for the integral constraint, which assumes a 
linear theory correlation function in ^(i?box), hkely be- 
comes inaccurate in this regime as well (Appendix . 

On scales much smaller than the box size. Fig. |4] shows 
that as structure evolves in the simulations the self- 
similar behavior extends below the scale of the initial 
mean inter-particle spacing, in some cases approaching 
the force softening. Since at best the initial conditions 
of the simulations only match the self-similar correlation 
function down to the initial mean interparticle spacing 
this result is non-trivial and difficult to anticipate from 
first principles, [i^l, using n — —1 simulations, show 
that Fourier modes in the non-linear regime are largely 



determined by the collapse of large-scale modes rather 
than by evolution of power initially on those scales. How- 
ever, [41j and collaborators have argued that the common 
practice of setting the force softening significantly smaller 
than the initial mean interparticle spacing introduces er- 
rors which arise from the possibility that with this choice 
the equations of motion for the particles are no longer 
true to the Vlasov-Poisson fluid equations. Despite this, 
they also find that ^(r) (and P{k)) can reliably be mod- 
eled below the scale of the mean interparticle spacing. 
The main effect of aggressive force softening is to cause 
~ 5% disagreement with the true non-linear ^(r) (and 
P(fc)) on scales larger than the mean interparticle spac- 
ing. The P-sampled results shown in Fig. 2] are in qual- 
itative agreement with the simulations presented in [41 1 
in the sense that accurate non-linear behavior is observed 
below the mean interparticle spacing and on larger scales 
the measurements are consistent with the self-similar so- 
lution also at the level of ~ 5%. Although beyond the 
scope of this paper, it would be interesting to run the 
P-sampled simulation set with less aggressive force soft- 
ening (e.g. half the mean interparticle spacing) to test if 
the measured error on the mean ^(r) (and P{k), see §|IV| 
is detectably smaller, as predicted in [41]. At any rate, 
for all three powerlaws the self-similar behavior extends 
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FIG. 4. Measured correlation functions from simulations relative to high-resolution results for the self-similar scaling (^ss(r-); 
black lines in Fig. |3]). Points show the mean £,(r). Panels are organized as in Fig. [3] (top panels: P-sampled results, bottom 
panels: ^-sampled results, n = —1,-1.5 and —2 from left to right). Vertical lines show relevant numerical scales: the initial 
mean interparticle spacing (dotted black), the Particle- Mesh Grid (dot-dashed black), and the force softening (dashed black). 



well below the scale of the mean interparticle spacing; it 
does not significantly depend on whether power is being 
rapidly "transferred" to smaller scales as for n — —1.5 
and n = — 2 or whether the non-linear growth proceeds 
less quickly than the linear theory prediction on small 
scales (i.e. r < ro), as for n — —1. 

B. Powerlaw Times a Bump Results 

As discussed in depth in [2^, a real-space powerlaw 
times a bump can be used as a self-similar numerical 
test in addition to providing insight into the non-linear 
physics of the evolution of the BAO feature. In this case, 

6c (r) - (^)"^' (1 + ^bumpe-('-'--°)'/2--), (19) 

and for resemblance to the ACDM correlation function I 
chose Abump = 2.75, CTbao/'^bao = 0.075, and powerlaws 
of n = -0.5, -1, and -1.5. Unlike ACDM, this setup 
can be evolved much further than erg ^ 1 to investigate 
the non-linear physics of the problem. I compare results 
in Fig. [5] from ^sampled ensembles of simulations with 
N = 256'^, rbao/^box = 1/20 to gaussian fits to the re- 
sults of the P-sampled, — 512^, rbao/-^bao = 1/20 
sets of simulations in [2J|. At least qualitatively the two 
simulation sets agree well. The slight small-scale discrep- 
ancy with the gaussian fit in the last output shown for 



the n = —1 case is a problem with the fit itself, as ap- 
parent in the P-sampled n = —1 results shown in Fig [5] 
while the discrepancies with the fit on scales larger than 
the bump stem from the mean of the fifty, randomly- 
sampled DC modes being slightly above zero, much like 
the pure powerlaw case for n = — 1, a problem easily fixed 
by running more realizations. The slight mismatch in the 
amplitude of the bump in the n = —1.5 case is mirrored 
in the P-sampled, A^ — 256"^, rbao/-^box = 1/20 results in 
Fig. 8 of and the discrepancy, at large r/rbao, for the 
n = —0.5 case is in a regime where the amplitude of the 
correlation function is very weak and difficult to measure. 
Overall, the evolution of the bump is remarkably similar 
to previous results and, importantly, the non-linear shift 
of the bump in the n — —1.5 case and lack of a shift 
in the n = —0.5 and —1 cases agree with P-sampled 
simulations. Insofar as these two methods are equally 
valid approaches to setting up and running cosmological 
N-body simulations, this conclusion should be reassuring 
to the wider effort to characterize the non-linear shift of 
the BAO peak using standard, P-sampled simulations. 

As with the powerlaw results in the previous section, 
despite running a few times more simulations than the 
P-sampled ensembles, the measured error on the mean 
^(r) is still quite large. (In [2^, I typically ran only 
seven realizations in each case.) This seems to be the 
case for a variety of powerlaws and in simulations with 
and without a bump. A close inspection of the original 
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FIG. 5. Correlation function results from ensembles of 50 ^- 
sampled simulations using initial conditions consistent with 
a powerlaw times a gaussian bump as a simplified model of 
baryon acoustic oscillations. Dot-dashed lines show gaussian 
fits to the high-resolution simulations presented in 23, Fig. 
3]. Typical errors on the mean are shown offset to the right. 



^(r) results in Q likewise show in ACDM simulations 
that the error on the mean ^(r) is significantly larger in 
^-sampled simulations. Q does not not comment on this 
interesting result. I return to this issue in ? IVII 



IV. P(K) RESULTS 

Although ^-sampled simulations are designed to match 
the real-space properties of the cosmological density field, 
it is nevertheless interesting to measure the power spec- 
tra in these simulations as another self-similar numerical 
test. These results are presented in Fig. [6] As seen in 
the ACDM simulations in Q, the oscillations in the ini- 
tial (^-sampled power spectrum largely die away. These 
features are essentially absent in the n = — 1 case while 
in the n ~ —1.5 and n — —2 cases the oscillations about 
linear theory for k/k^i^ <^ 1 seem to persist for longer. 
This is likely explained by the fact that there are more 
factors of expansion between the epoch of the initial con- 
ditions and an output at a given a/a* in the n = —1 
case than the n = —1.5 or n = —2 cases. (As a result, 
the n = —1 case requires much longer execution times to 
reach a/a* ^ 0.5 than n = —1.5 or n = —2.) 

I compare the power spectrum results in Fig. [B] to fit- 
ting functions from higher resolution simulations. For 
n = — 1, I compare with an improved version (23l . Ap- 
pendix A] of the fitting function from the high-resolution 
simulations by |38]. The n = —2 fitting function comes 
directly from [38] and the n = —1.5 fitting function 
(which was not investigated in [s^) comes from 10 P- 
sampled, N = 512^ simulations presented in [2^ Ap- 
pendix A] . Generally, the agreement with the higher res- 
olution non-linear fits is quite good and, intriguingly, the 
final outputs of the ^-sampled results in each case seem 
to agree more closely with the non-linear fitting func- 
tion than the P-sampled case. One might have expected 
the ^-sampled simulations to more accurately handle sit- 
uations with large amounts of clustering power on the 
scale of the box, since the fiuctuating DC mode is more 
true to the large scale clustering in the real universe. 
With the correlation function, it is difficult to see which 
scheme performs better in this regime, since a discrep- 
ancy between the P-sampled correlation function results 
from different outputs - a violation of self-similarity - or 
with some higher-resolution result can potentially be ex- 
plained as a failure of the integral constraint correction, 
which assumes a linear theory model for ^(r) in ^(i?box)- 

The self-similarity of the power spectrum exhibited in 
^-sampled simulations when the clustering levels become 
very high on the scale of the box appears to be one of 
the principal improvements offered by this approach in 
the tests shown here. Ironically, a method designed to 
match the real-space properties of the density field ap- 
pears to more accurately render the fourier-space statis- 
tics in this regime. It is tempting to speculate that the 
difficulties, historically, with realizing self-similar evolu- 
tion in n = — 2 simulations with small numbers of parti- 



10 



P-sampled ICs (n = -1 ; N = 256^; 20 simulations) P-sampled ICs (n = -1 .5; N = 256^; 20 simuiations) P-sampled iCs (n = -2; N = 256^; 20 simuiations) 




3.0 
2.5 
2.0 
1.5 
1.0 



0.5 
10 



a / a, = ' 






0.510 






0.255 






0.191 






• 0.128 i 






0.089 1 




C\J —I 

< 


• 0.038 / 






■ fit to N=512^ results /I - 


In 

< 



10 



10 
k/k 



10 



10 



10 
9 
8 
7 
6 
5 
4 
3 
2 
1 

10 



a / a. = ' 




0.456 




0.342 




0.228 




■ 0.171 




0.114 j 




■ 0.080 




Widrow et al. fit W 



10 ' 



nl 



10 10 

k/k , 

nl 



256''; 50 simulations) 



^-sampled ICs (n 
3.0 



-1.5; N = 256''; 150 simulations) 



2.5 
2.0 
1.5 
1.0 
0.5 



a / a. 



0.510 
0.255 
0.191 
0.128 
0.089 
0.038 

fit to N=512^ results, 




5-sampled ICs (n = -2; N = 256' 
10 



10' 10' 
; 50 simulations) 



10" 



10" 



10" 
k/k 



10' 



10' 



a / a. = ' 




0.456 




■ 0.342 




0.228 




■ 0.171 




0.114 i 




■ 0.080 




Widrow et al. fit W 



10"' 



10" 10 
k/k , 

nl 



10' 10' 



FIG. 6. Comparison of non-linear power spectrum results from P-sampled (top panels) and ^-sampled (lower panels) ensembles 
of simulations for initially n — —1 (left), n — —1.5 (middle) and n = —2 (right) powerlaws. The x-axis is scaled by the non-linear 
wavenumber k^i, defined by A^(fcni) = 1- Solid black lines show results from high-resolution simulations (see text). 




FIG. 7. Non-linear power spectrum results relative to fitting functions to results from high-resolution simulations {A'^g{k); 
black lines in Fig. (B}. Panels are organized as in Fig. [S] The a;-axis is shown relative to fcbox = Svr/Lbox and dashed vertical 
lines show the particle nyquist wavenumber and half the particle nyquist wavenumber. The n = —1.5 results do not extend to 
large k because non-linear fitting functions from very high-resolution (TV 3> 512'') simulations are unavailable in this case. 
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cles could have been alleviated by this method |22| . |42| . 
However, note that the errors on the mean in Fig. [S] are 
considerably larger in ^-sampled simulations with even 
2.5 X the number of realizations than in P-sampled sim- 
ulations. This penalty is severe - at least doubling the 
computational expense to obtain the same errors on the 
mean power spectrum and correlation function. I return 
to this issue in § I VI I discussing whether "paying" this 
penalty is more advantageous than increasing the num- 
bers of particles in the box in simulating the non-linear 
evolution of very steep {n < —2) power spectra as would 
be relevant to studying the evolution of the very large k 
power spectrum in ACDM [43, 44]. 

As a more quantitative comparison in Fig. [7] I show 
the ratios of the non-linear power spectrum results to 
the high-resolution self-similar fits. Bearing in mind, 
again, that the initial power spectrum has oscillatory fea- 
tures from the cutoff in Eq. [21 the ^-sampled simulations 
typically match the self-similar solution on scales close 
to the box better than the P-sampled simulations. On 
scales at or below the initial mean interparticle spacing 
(fc ^ '^Ny.p) all the simulations show evidence that self- 
similarity extends into this regime. The result is most 
dramatic for the n — —2 results where the first out- 
put has very little spectral power at A; ~ ^Ny,p but a 
number of later outputs agree with the self-similar scal- 
ing to better than 10%. A similar trend is apparent for 
n = —1.5 but the non-linear fit comes from much lower 
resolution simulations than n = — 2 or — 1 [23l Appendix 
A]. In Fig. [7] each output is only shown up to the range 
of validity of the high-resolution non-linear fitting func- 
tions. For n = —\, the first output shown was taken after 
the particles moved significantly from their initial posi- 
tions, which is why there is appreciable spectral power 
already at fc « ^Ny,p- In all the simulation results the 
self-similar behavior extends continually further beyond 
the initial mean interparticle spacing as the simulation 
progresses. This echoes a result from [4l|, who, in focus- 
ing on ACDM simulations, state "We see this maximal re- 
solved wavenumber propagates to larger k in time, reach- 
ing a final value of order [the force softening]". Unfortu- 
nately, the dynamic range of Fig. [7] does not extend all the 
way to the force softening scale at fc/fcbox ~ 2560. This is 
primarily due to the limited range of the high-resolution 
non-linear fitting functions. To ameliorate the concern 
that the grid used to measure A^(fc) may be affecting 
the behavior at large fc/fcbox, except for the ^-sampled 
n — —1.5 results, a 1800'^ grid was used to measure A^(fc) 
from each realization. The measurement should therefore 
be reliable out to fc w ^Ny, grid/2 or fc/fcbox ~ 4500 



^ The ^-sampled n = —1.5 results were computed with a 1024'' 
grid to avoid the computational expense of performing an extra 
150, 1800^ FFTs in a case where the non-linear fitting function 
is the bottleneck at large fc/febox- This is a minor detail and 
the P-sampled n = —1.5 results, which were computed with a 
1800^ grid, agree well with the ^-sampled measurements at large 

^/^box- 



Finally, to comment again on the small fc/fcbox behav- 
ior, the P-sampled results show clear trends of under- 
predicting the non-linear power spectrum as the cluster- 
ing level approaches the box scale. Seto [45| proposed 
modeling the inaccuracies introduced from the finite scale 
of the box using perturbation theory but stopped short 
of testing this approach with cosmological A^-body sim- 
ulations. Since the investigation is largely orthogonal to 
the narrative of comparing the P-sampled and ^-sampled 
methods to each other, this study, which finds qualita- 
tive agreement between the n = —1.5 and ri = — 2 results 
and the perturbation theory expectations, is presented in 
Appendix [Cl 



V. HALO MASS FUNCTION AND HALO BIAS 

As aspects of structure formation with their own set of 
numerical challenges [e.g. |4§|, I also investigate the 
halo abundance and clustering bias in my simulation en- 
sembles. I principally compare my results to p6j who 
measured the /(i^) mass function in a variety of differ- 
ent powerlaw models using higher resolution simulations 
than employed here but with far fewer realizations. Using 
halo abundances, [i^ reports their best fit parameters for 
each powerlaw to the Sheth-Tormen fis] form of /(i^), a 
result which can be used to predict both the halo abun- 
dance and the halo clustering using the framework of [4^ . 
Other models for /(z^) exist in the literature as calibrated 
fits to ACDM simulations. I principally compare to pGl ]. 
who focused on pure powerlaw models. 



A. Halo Mass Function 

Fig.[8]shows my primary results for the halo mass func- 
tion. To highlight the self-similar evolution in the upper 
panel of each sub- figure I multiply the halo mass function 
by a factor of (47r/3)P2 = M^/pm where a{M^) = 5c- 
This choice is motivated as follows: the halo mass func- 
tion can be characterized solely as a function of cr(M), 



dn 



Pmdloga ^ 

dlogM ~ M dlogM 



(20) 



where v — 5c/(t{M) and f{v) is calibrated to simula- 
tions or estimated analytically. For powerlaw models 
I. = (Af/Af,)("+3)/6 and Eq. [20] becomes 



dn 



Pn 



dlogM M 



6 



M 



(n+3)/6 ( f M\ 

(21) 

so that in multiplying by A'h/ pm = (47r/3)PiJ this expres- 
sion becomes a unique function of M /M^, . Alternatively, 
this {Att/?>)R^ factor can be thought of as a characteristic 
non-linear volume, the abundance of halos at a particu- 
lar M/M^, halo mass being set by the ratio of the fixed 
volume to the characteristic non-linear volume. For this 
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FIG. 8. Halo mass function results for my fiducial simulation set. Results are shown either scaled by (47r/3)i?i^ — A'h/pm 
(upper sub-panels in each case) or relative to the fitting function of '4C| (lower sub-panels in each case). For each output the 
characteristic non-linear mass (cr(M,) = Sc) relative to the mass of a particle in the simulation is shown in the legend; each 
output is colored consistently with other plots. Error bars show the error on the mean. 



reason the earliest simulation outputs probe the large 
M/M* regime because the fixed volume of the simulation 
is so much larger than (47r/3)i?i^ that rare supermassive 
objects can be more easily found than at later outputs 
when the non-linear volume is significantly larger. 



Halo Bias 



In Fig. [5] both ^-sampled and P-sampled simulation 
sets give remarkably similar results in each case. Divid- 
ing by the prediction from [46] as in the bottom panels 
in each sub-figure does not break self-similarity, and the 
overlap of the mass functions from each output is still a 
test of self-similar evolution that my simulation results 
typically achieve, although, interestingly, the self-similar 
solutions deviate significantly from the fits reported in 
[46|. This can be explained by noticing in the plots shown 
in [igl that the measurements of /(j/) from their simula- 
tions can fall below the best fit f{i/) by up to 50%. Had 
they used more than two free parameters in their fit to 
f{i>) (as they did in adopting the Sheth-Tormen frame- 
work), it seems likely that my results would agree more 
closely with theirs. 



In the Sheth-Tormen j48|| framework the halo bias is 
given by 



6(M) 1 + 



at/ — 1 



2p/6, 



1 + [avY 



(22) 



where a and p are free parameters. Using the best fit 
values for a and p quoted in (46| for each powerlaw, in 
Fig. [S]I compare their results to the measured halo bias 
in my simulations. Although the halo bias can be equally 
well measured in fourier space [i^, [s^ as in real space, to 
avoid the oscillatory features in P{k) I measure the bias 
with b'^{M) = £,hh{r) / £,mm{r) in both P-sampled and ^- 
sampled simulation results, where S.hhi'r) is the halo- halo 
correlation function and ^mm {f) is the matter-matter cor- 
relation function (elsewhere referred to as £,{r)). One 
subtlety with this measurement deserves mention: in P- 
sampled simulations the ^;ih(r) measurements, like the 
S.mmi'f) measurements, suffer from the artificial imposi- 
tion of the integral constraint on the scale of the box. 
In Appendix [Bl drawing upon the procedure for correct- 
ing ^mmlf) described in |23l . Appendix B], I demonstrate 
that this correction cancels (at least to first order) in 



13 



P-sampled ICs (n = -1 ; N = 256 ; 20 simulations) 
10^ 

Jing(1998) 

Sheth STormen (1999) 
Bagia et al. (2010) 
1Q' ^ N-body simulations 



10 



10 




10 ' 



10 



10 

M/M. 



10 



10 



^-sampled ICs (n = -1 ; N = 256^; 50 simulations) 



10' 

Jing (1998) 

Sheth & Tormen (1999) 
Baglaetal. (2010) 
. N-body simulations 




10 



P-sampled ICs (n = -1 .5; N = 256 ; 20 simulations) 
10^ 

Jing (1998) 

Sheth & Tormen (1999) 
BagIa et al. (2010) 
1Q' ^ N-body simulations 



10 




P-sampled ICs (n = -2; N = 256 ; 20 simulations) 
10^ 

Jing (1998) 

Sheth & Tormen (1999) 
BagIa et al. (2010) 
1Q' ^ ■ N-body simulations 



10 



10 



5-sampled ICs (n = -1.5; N = 256 ; 150 simulations) 

10^ I 

Jing (1998) 

Sheth & Tormen (1999) 
Baglaetal. (2010) 
.jgi . N-body simulations 



10 




10 



10 



10" 
M/M. 



10 



10 




10' 



10' 



10 



^-sampled ICs (n = -2; N = 256^; 50 simulations) 



10" 



Jing (1998) 

— Sheth & Tormen (1999) 

— Baglaetal. (2010) 

■ N-body simulations 




10" 



10" 



10" 
M/M. 



10' 



10"^ 



FIG. 9. Halo clustering bias measured with h^{M) — ^hh{r)/^mm{r)- Results from my simulations are shown by magenta 
points; error bars show the measured error on the mean. In each panel the bias functions of Jing [i^ (blue), Sheth- Tormen [31 
(green) and Bagla et al. [i3|(cyan) are shown for comparison. Since the Sheth- Tormen [3] bias has been calibrated to ACDM 
simulations (rather than scale-free models) this prediction is shown only for reference to the familiar ACDM case. 



the measurement of the clustering bias, ^-sampled simu- 
lations naturally circumvent this issue, and no integral- 
constraint corrections are needed to the measured ^hh (r) 
and ^mm(r). 

Generally, in Fig. [9] the bias measurements from the 
two methods agree within errors and the results typically 
stay closer to the predictions rather than the formula 
from [13 . (The Sheth- Tormen ^ bias, calibrated to 
ACDM simulations, is overplotted for comparison.) At 
large M/AI^, in particular the measurements typically fall 
below the fitting function of [13], as previously observed 
for n = — 1 in [36] . who point out that the fit deviates 
slightly from the original simulations in [47]. My results 
(and indirectly the /(i/) results of [i^) corroborate this 
conclusion. 

Of particular note in Fig. [9] the ^-sampled bias for 
n = — 1 shows substantially larger error bars at low 
M/M^: than the P-sampled case. This discrepancy stems 
from the large variance of £,mm (f) and deviation from the 
self-similar fit, as seen in Figs. El & m rather than the 
^hhif) measurement. 

Taken together the results in this section assert that 
the process of halo formation is consistent between the 
two methods, at least for the measurements I show here 
(see also [19]). 



VI. BOX-TO-BOX VARIANCE 

Having explored the ensemble-averaged predictions for 
the mean ^(r), P{k), and halo mass function, in this sec- 
tion I turn to comparing the box-to-box variance of these 
quantities with expectations from theory. As emphasized 
by 5] , it is important to be able to model both the mean 
dark matter and/or halo statistics and the covariances 
of these quantities to infer the appropriate cosmological 
constraints from a set of large-scale structure data. In 
this vein, there are a number of group s that have taken 
up this concern for ACDM [^, [siI - Im ]. By comparing 
gaussian and other expectations for the box-to-box vari- 
ance of these statistics, I aim to compare these predic- 
tions to a more diverse range of cosmological models with 
the added simplicity and clarity of powerlaw setups. 



A. Var ^ results 

Fig. 1101 shows my main results for the box-to-box vari- 
ance of the correlation function, in each case multiply- 
ing by a well-motivated power of (ibox/'^o) to normalize 
the y-axis. In each plot I compare with the expected 
Gaussian variance for a volume V — L^^^ as well as an 
estimate of what I refer to as a "floor" to the variance 
from non-linear effects (i.e. higher-order statistics) pro- 
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FIG. 10. Measurements of the box-to-box variance of ^(r) (colored points) compared to expectations from Gaussian statistics 
(dashed colored lines). Open circles show a jackknife estimate of the variance from breaking up each simulation box into octants 
and measuring the correlation function in each sub-volume. The anticipated contribution to the variance from higher-order 
correlations (as predicted by "Hyper- Extended" PT [s^) is shown with gray dot-dashed lines. The y-axis in each plot is scaled 
by (ibox/fo)'"^^''^^ so that this expected "floor" to the box-to-box variance is a fixed horizontal line. Also shown is the Poisson 
noise level (colored dotted lines) which only happens to be large enough to be visible for the first outputs in each case and, for 
the ^-sampled results in the lower three panels, a black dot-dashed line shows the result of Eq. 1301 



vided by Hyper Extended Perturbation Theory [HEPT; 
[ssj . This term produces a fractional error proportional 
to the volume averaged correlation function, 

= v/4(l-2Q3 + Q4mox) (23) 

where and Q4 come from HEPlj^. Note that the 
scalings in Fig. [10] are designed to cancel with the 
VsC^box) ~ (''oZ-^box)^"'*''^^^^ dependence of this term so 
that the "HEPT floor" is a horizontal line in each plot. 
Although not expected to be an extremely good match 
to simulations, I include it in Fig. [10] for qualitative com- 
parison to my results. 

The Gaussian variance is given by 

[53} . which, for powerlaw, P{k) = Ak"-, models can be 



The Journal version of this paper contains a typographical error 
in the equation for Q4. The arXiv version is correct or, c.f., [56| . 



shown to be equivalent to 



cr^ _ An v/r(l + 2n)sin7i7r / r \ 
epow(r) " IT 4("+i)/2 Uboxj ^ ' 

where A = yl„rQ^^, and r(l + 2n) is the usual gamma 
function. Notice that all of the tq dependence has can- 
celed out with the division by ^pow('') — {ro/r)^^^- Un- 
fortunately, Eq. [55] is only convergent for the limited 
range of —1.5 < n < —0.5. Therefore, the Gaussian 
variance predictions for the n — —1.5 and n — —2 re- 
sults in Fig. [TU] include cutoffs in P{k) at k = fcbox- As 
a correction to a divergent result, the poor comparison 
of the expected Gaussian variance for these two cases is 
unsurprising. 

In Fig. 1101 the ^-sampled results clearly exhibit much 
larger box-to-box variance than the P-sampled simula- 
tions. The ^-sampled n — —I results are perhaps most 
instructive since the Gaussian variance is accurate in this 
case - the measured variance is clearly much higher than 
this expectation. Though initially quite puzzling, the 
box-to-box variance of ^(r) turns out to be almost en- 
tirely dominated by the variance of the DC mode, so 
much so that it is as if the (auni/flbox)^ factor in Eq.[3]is 
the direct cause of the large variance over a wide regime 
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FIG. 11. Focusing on the box-to-box variance of ^(r) for 
the P-sampled, n = ~\ simulation ensemble (top left plot in 
Fig. [To]). The y-axis is shown without the (I/box/ro)'"+^'''^ 
scaling used in Fig. [TU] and the x-axis shows the scale, r, of 
the correlation function measurement relative to the box scale 
so that the expected Gaussian variance (dashed black line; 
Eq. 1251) appears as a fixed line. Results from including a low 
k cutoff at = 27r/Lbox in the expected Gaussian variance 
are shown for comparison as a solid gray line. 



without any interference from the variance of the r(r) 
term. More explicitly, since abox,i/auni = 1 ^ 



flbo 



1 + A, + -A,^ 



(26) 



and referring to '^uni,i('') as the result of scaling fbox,i('') 
with the appropriate weighting. 



2 V{r\ 

euni,i(0« (1+A. + -A2)-^ 



1 



(27) 



r(r) 



r(r) 



1 + -^A,; + -A? 



2^2r(r) 
3 



= Cbox,iW + (1 +ebox,i(r-))A, + (1 + Cbox,iW)3A? . 

Ignoring, for the moment, the A^ term, 

Cunia(r) « aox,i(r) + (1 + ebox,i(r))A, (28) 

and computing the variance of ^uni.i while assuming that 
(Aj) = (i.e. that the mean of the DC mode is equal to 
zero) it can be shown 

(eL.i) - (eu„ia(r))^ « (1 + 2(Cboxa(r)))(A2) (29) 

2«+27r ' - ^ 



(l + 2(ebox,i(r)))- 



L 



box 



(30) 



where in the last step I used the fact that the variance 
of the DC mode, (Af), is equal to Prcai(0)/ij^^^. Us- 
ing black dot-dashed lines, I compare this formula, with 



much agreement, to the ^-sampled results in Fig. [TO] An 
important detail in this derivation, which was lost in 
dropping the A? term in Eq. 1281 is that the mean correla- 
tion function (i.e. (^uni.i (?"))) will still retain a (2/3) (Af) 
term. This term is schematically very similar to the in- 
tegral constraint correction described in psl Appendix 
B]. The difference is that in ^-sampled simulations this 
correction is, in a sense, applied individually to each sim- 
ulation with a large scatter, rather than statistically as a 
correction to the ensemble-averaged correlation function. 
The n — —\ case, where the expected variance of ^(r) 
from Gaussian statistics is still well defined, is decisive 
for assessing which approach, if either, is more accurate. 
With this in mind, the lower left panel of Fig. [TU] shows 
that the variance from the ^-sampled method substan- 
tially exceeds the Gaussian expectations and the upper 
left panel of Fig. [10] argues that the results from the P- 
sampled method fall significantly below this expectation. 
This latter conclusion is more clearly shown in Fig. [TT] 
which shows the P-sampled variance results for n = —\ 
relative to the scale of the box. As in Eq. [25] the Gaus- 
sian variance in Fig. [TT] is independent of Tq. On the 
largest scales, the measurements from simulations sys- 
tematically fall below the Gaussian expectations while 
on smaller scales non-linear (i.e. higher order) contribu- 
tions to the variance dominate and the comparison to 
pure-Gaussian statistics is no longer appropriate. The 
suppression of the variance of ^(r) on large scales closely 
resembles the prediction from Eq. [24] using a lower limit 
of fc = 27r/Lbox (solid gray line). 

Note that ^5?'] found good agreement comparing Eq.[25l 
to the covariance of the halo-halo correlation function. I 
point out that their comparison was made for r/Lbox ^ 
0.1, where my results match the gaussian expectations 
without a low k cutoffI3 



B. Var P(k) results 

As discussed in the previous section the ^-sampled 
method features large box-to-box variance in the corre- 
lation function. Fig. [T^ shows the box-to-box variance of 
the power spectrum results in my simulation set. Unsur- 
prisingly, the variance of the ^-sampled results is much 
greater than the variance of the P-sampled results, which 
lie closer to the black-dashed line indicating the gaus- 
sian uncertainty from the number of modes at each fc, 
i.e., Vari/2(p(fc))/p(fc) ^ 2^1^^"^. Overlaid for the P- 
sampled simulations in dot-dashed lines is the expected 
variance from non-linear effects related to the amplitude 
of the trispectrum 52] , the fourier-space analogy of the 



n.b. [531, focusing on halo statistics, required careful bin- 
averaging of the sin(fcr)/fcr terms in Ea. l25l to achieve this agree- 
ment. In my matter correlation function results the bin widths 
are sufficiently small that this effect is negligible. 
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"HEPT floor" in Fig. [TOl This term depends on per- 
turbation theory results which are not expected to be 
extremely accurate in practice. Nevertheless the n = —1 
results agree reasonably well, while the n — —1.5 and 
n = — 2 cases are much less convincing. 

Though this source of variance should still be a con- 
tribution to the variance in ^-sampled simulations, it is 
nevertheless small and I omit this term in the lower pan- 
els of Fig. [T^ for clarity, showing instead an estimate of 
the "beat coupling" expectation from [s^l- I apply this 
term in much the same way as [l^, directly using the 
variance of the DC modes in the simulations rather than 
PL(2fcbox) to estimate this effect which is only relevant to 
simulations being used to produce mock catalogues, jll] 
find that the measured amplitude of the beat coupling 
term is ~ 15% below the expectation from HEPT. Com- 
pared with my results, the simulation measurements are 
closer to ~ 50% lower variance than this expectation. 
To make the most accurate mock catalogues possible, 
with accurate covariances, this effect should be studied 
more carefully. I will return to this issue in future work. 
The interested reader should consult [11] and references 
therein for a recent consideration of these issues. 



C. Var hmf results 

For predicting the abundance of galaxy clusters or for 
more general use, high precision predictions of the halo 
mass function are becoming increasingly important for 
inferring the original, linear density field (containing the 
cosmological information) from observations of halo num- 
ber counts. Here again, there exists an expectation of the 
variance from a sample volume which can be compared 
to the simulation results. Derived originally by [59j], the 
variance in the halo mass function should be simply re- 
lated to the Poisson noise and a sample variance term, 

a^ = ^^%^ = ^ + 6V(i?bo.) (31) 
UhV 

where cr(i?box) is the rms variance in the matter overden- 
sity in the sample volume. The first term in Eq.[3T]is the 
Poisson error from number counts. Separating these two 
sources of error in Fig. [T3] I plot separately the Poisson 
noise with dotted lines and the expected sample variance 
with a black dot-dashed line. The y-axis is normalized 
by cr(i?box) so that the "sample variance" curves in this 
plot is simply equivalent to the bias, 6(M/M*). I assume 
the Bagla et al. [i^ model for the bias as discussed in 
the previous section. 

The measured variance in P-sampled simulations 
seems to fall well below the sample variance estimate; 
the Poisson errors dominate in all regimes. This has been 
noticed before in a few ACDM studies [13, [Ml The mea- 
sured variance in the ^-sampled simulations do seem to 
give different predictions for the variance, although, con- 
fusingly, they do drop significantly below the expected 



sample variance for low M/M*. It is not entirely clear 
why this is and I defer the topic to future work. 



VII. SIMULATING EXTREMELY INFRARED 
SPECTRA 

In a suite of tests I demonstrated the accuracy of the 
^-sampled method for the ensemble-averaged statistics 
of £,{r), P{k), the halo mass function and the halo clus- 
tering bias, finding good agreement with the expected 
self-similar scaling of these statistics and agreement with 
results from P-sampled ensembles. Important differences 
emerged, however, in comparing the box-to-box variances 
of those quantities, with the ^-sampled method predict- 
ing significantly larger variance. For the Var ^(r) results 
this variance greatly exceeds the expectations from Gaus- 
sian statistics and an estimate for the contribution to the 
variance from higher-order correlations. Deferring, for 
now, the question of modifying the ^-sampled method, 
this result implies that this approach is substantially sub- 
optimal for calculating the mean ^(r) and P{k) in most 
situations of interest. 

A possible exception to this, noting the remarkably 
robust self-similar behavior of P{k) for n = —1.5 and 
—2, is in cases where one is interested in extremely red 
power spectra (rieff ^ —2) . This is a particularly diffi- 
cult to simulate regime because the large-scale clustering 
level rapidly grows to the scale of the box, causing de- 
viations from the true non-linear behavior. Physically, 
this regime is most relevant to small box simulations for 
modeling the "end" of the CDM hierarchy [H, Sil or (if 
the iV-body method is extended to include hydrodynamic 
effects) small box cosmological simulations of galaxy for- 
mation. 

To test whether the ^-sampled method can successfully 
model the self-similar behavior of a power spectrum red- 
der than n = — 2, I ran tests with an initial n — —2.25 
power spectrum using both ^-sampled and P-sampled 
methods and similar numerical parameters to the simula- 
tions presented elsewhere in this paper. For comparison, 
[ssj l present P-sampled simulations of n = —2.25 with 
much higher resolution. 

A number of issues - both technical and fundamen- 
tal - make this a challenging problem for the ^-sampled 
method. In the n = —2.25 case there is no conve- 
nient analytic or special-function form of Eq. [2] to aid 
in generating the ICs and, instead, the large dynamic 
range of the oscillations in Prcai(fc) must be captured 
entirely by numerical integration. Inaccuracies in nu- 
merical integration can also shift the distribution of 
values slightly away from gaussian, an issue that can be 
fixed by generating the ICs for each realization (i.e. ran- 
dom phases and values for the entire ensemble) before 
evolving the particle trajectories and checking first to en- 
sure (or otherwise choosing by fiat) that (A;) sa and 
(Af) w Pi.cai(0)/Lbox- For the first constraint, since A^ 
is a gaussian variable, the error on (A^) is simply related 
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FIG. 12. Results for the box-to-box variance of P{k) in my fiducial simulation ensembles. In each plot I show in black 
dashed lines the expected variance from mode counting (= 2N'^^^^). For the P-sampled simulations in the upper three panels 
I additionally show with colored dot-dashed lines the expected contribution to the variance from higher-order correlations [s^ ]. 
while for ^-sampled simulations in the lower three panels( which, with fluctuating DC modes, are more analogous to surveys), 
I show expectations from the "beat coupling" result of [5J] and omit the expected variance from non-linear effects for clarity. 



to the variance, so |(Ai)| < 0.01 places the constraint, 

> . 100^/^. (32, 

Notably the earlier n = —1.5 results did not initially meet 
this criterion and many more realizations were required 
while the 50, n = —1 simulations presented here do not 
meet this constraint, which causes the discrepancy noted 
previously at large r/Lbox in the lower left-hand panel of 
Fig. |4l A more model-independent lower limit on A^sims 
comes from considering that there must be sufficiently 
many realizations to ensure that the fractional error on 
(A^) is small, i.e., 

(^i,,^?!;^.^ ,33, 

-^*sims -^^siins 

For a maximum fractional error of 4% this implies 
Asims > 25%/3 « 43, which is comparable to the num- 
ber of ^-sampled realizations in the tests reported in this 
paper. 

Any ensemble of ^-sampled simulations should fol- 
low the guidelines just mentioned. A peculiarity of ex- 
tremely red initial power spectra is that Eq. [T^] can give 
-frcai(fc) < at certain k values if n < —2. This means 
that the "goal" correlation function does not correspond 



to a physically realizable density field. In the n = —2.25 
results shown here, I set -Prcai(fc) = whenever negative 
Picai{k) values are reported. The consequences of this 
choice will be discussed shortly. 

Fig. [T3] compares results from 20 P-sampled simula- 
tions of n = —2.25 to 45 ^-sampled simulations that have 
been randomly selected to ensure that the distribution of 
Ai values is sufiiciently gaussian. Also shown for compar- 
ison is the non- linear fitting function for n = —2.25 from 
one N = 1586"^ P-sampled simulation presented in 38|f| 
This fitting function should be more accurate than either 
the P-sampled or ^-sampled simulations which were per- 
formed with far fewer particles (A^ = 256^). Note that, 
unlike the plots shown in Fig. [6l the measured error on 
the mean is shown for each output in the P-sampled case 
and for the ^-sampled results the measured error on the 
mean is shown for the last output; error bars from the 
earlier outputs are omitted for for clarity. 

Comparing the last outputs in each case, the most 
striking result is that at high clustering levels (large a/a*) 
the P-sampled approach strongly deviates from the locus 



The paper contains a typo in the non-linear fitting function for 
n = -2.25 cited in Table 2. [3 should be 1.068 instead of 0.658 
(L. Widrow private communication). 
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FIG. 13. Measurements of the box-to-box variance of the halo mass function in our fiducial simulation set. As in Fig. [H the 
legend shows outputs in terms of the characteristic non-linear mass, M*, and the color scheme of the outputs is consistent with 
other figures. The y-axis is scaled by a(J?box) = <5c(J?box/^?»)~'"'''^'''^ so that the sample variance expectation (in Eo. I3ip is a 
fixed line in each plot. Colored dotted lines show the expected variance from Poisson noise (i.e. uncertainties from counting 
the number of clusters). 




FIG. 14. Comparison of ^-sampled (left) and P-sampled (right) simulations of an extremely red, initially n — —2.25 power 
spectrum. As in Fig. [6] the x-axis is scaled by the non-linear wavenumber, fcni, and solid black lines show results from high- 
resolution simulations (see text). Error bars indicating the measured error on the mean for each point are shown for all outputs 
in the P-sampled case while for the ^-sampled case only the measured error on the mean for the last output is shown for clarity. 



of self-similar behavior whereas the last output of the ^- 
sampled case is still roughly consistent with the results 
from earlier outputs. 

The offset between the high-resolution fitting function 
and the ^-sampled results in Fig. [5] seems to be related 



to the zeroing of Pi.eai(A:) over certain ranges which was 
required to make a consistent set of initial conditions. 
Fig. [15] presents the results of numerically fourier trans- 
forming Prcai(fc) into a correlation function, i^FTroai('') 
(Eq[T]), for n — —2.25. The zeroing of Prcai(fc) causes 
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FIG. 15. Results from numerically fourier transforming Prcai(fc) into a correlation function, 5FTrcai(r) (Eq.[T|. In the left panel 
CFTrcai(r) (gray points and circles) is compared to the "goal" n = —2.25 correlation function (black dashed line). The right 
hand panel divides ^FTrcai(»') by the linear theory powerlaw for n — —2.25. 



CFTroai(?') ^ for T > Lbox/2 and S,FTi-ca.\{r) > ^L{r) 
for r < Lbox/2. Interestingly, the ^-sampled simulation 
results can be brought into agreement with the high- 
resolution results from ^SSj if one concludes that the ^- 
sampled simulations were initialized with a linear the- 
ory amplitude that was about 10% too high. Given the 
deviation of ^FTreai('') from the linear theory powerlaw 
above the "goal" correlation function, shown in the right 
panel of Fig. [151 this seems like a plausible argument ~ 
namely that the zeroing of Preai(fc) can be compensated, 
to first order, by a shift in the amplitude of linear theory. 
However, in detail, the deviations of ^FTrcai(?') from the 
"goal" correlation function in Fig. [T5] clearly depend on 
scale. Therefore, to simulate extremely red power spectra 
with more success than was achieved by the n — —2.25 
results in Fig. [HI one must carefully make some assump- 
tion about ^(r) ^0 for r > Lbox/2 in order to keep 
-Preai(fc) > and maintain the most correspondence be- 
tween ^FTreai('') and the "goal" correlation function. For 
extremely red power spectra or ACDM initial conditions 
with small box sizes a ^-sampled approach optimized for 
this problem would therefore seek to minimize \^{r)\ for 
r > Lbox/2, rather than to simply setting the function to 
zero. 

Assuming that a robust framework for performing this 
minimization can be constructed and that yet steeper 
power spectra can likewise maintain self-similar behavior 
even when a/a* ~ 0.4, the ^-sampled approach appears 
to be the only practical way of simulating rics ~ —2.5. 
Following arguments in [s^ , let us assume that the small- 
est scales accessible to a simulation is set by half the 
particle nyquist wavenumber. 



ing we are interested in scales up to fcmax/^ni = 20, the 
number of particles required is 



(34) 



Z ^-t^box 4 

Noting that fcni/fcbox = (a/a*)"^/'^"+^^ [38l|, and assum- 



iVi/3 = 4 



fcmax \ / ^NL \ / O 



= 80 ( — I .(35) 



a* 



For n = —2.5 and assuming that the P-sampled method 
can work reliably out to a/a* w 0.25 this implies N 
20480'^, significantly beyond the range of current super- 
computers. By contrast, if a ^-sampled approach can 
successfully capture the self-similar behavior of P(fc) out 
to a/a, « 0.45 (as in Fig. [T4)) . the requirement relaxes 
io N K, 2000'^, which is well within the capabilities of 
current state-of-the-art machines. 

Unfortunately, achiving a similar feat for small box 
simulations using ACDM initial conditions would in- 
volve a number of complications related to the diffi- 
culties already mentioned for powerlaw initial condi- 
tions. For example, characterizing the ACDM linear 
correlation function with an effective powerlaw index, 
Cacdm('') = (^^/^)«otf(i-)-i-3 (analogously to the pure 
powerlaw case in Eq. Ilip . one finds that ricff < — 2 on 
scales as large as r ^ 2.5h^^ Mpc. This implies that for 
-^box ^ 2.5h~^ Mpc, much like the ^-sampled n = —2.25 
pure powerlaw case, the initial power spectrum from 
Eq. [2] is negative for certain k values. Another prob- 
lem that arises for small box ACDM simulations is that 
the need to ensure that the average overdensity of the 
ensemble is close to zero, as expressed in Eq. [32l leads 
to a prohibitively large number of simulations (see Q 
Fig. 4 for the scaling of (Preai(0)/Lbox)^^^ '^i^h box size 
in ACDM). These conclusions point to the inadequacy 
of the ^-sampled method for general use or even for the 
niche application of small box simulations. The next sec- 
tion explores possible modifications/improvements to the 
method. 
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VIII. DISCUSSION 

Although varying the DC mode in ^-sampled simula- 
tions correctly and naturally mends the agreement with 
^^(r) on the scale of the box, this feature comes at the 
cost of running many more simulations to reach the same 
precision on the mean ^(r) as in conventional P-sampled 
simulations. The indication from comparing the mea- 
sured box-to-box variance of ^(r) to predictions from 
gaussian statistics shows that this variance far exceeds 
these expectations and therefore the ^-sampled method 
is substantially sub-optimal for predicting the mean ^ (r) . 
Nevertheless, there are still compelling reasons to opt for 
the ^-sampled approach (0, HJ, IVIip . Can a modified 
^-sampled approach deliver the best of both ^-sampled 
and P-sampled methods? 

In an effort to explore this possibility, I experimented 
with different choices for abox/iuni- In particular, one can 
instead choose to set abox so that instead of ensuring that 
the age of the universe in each box is approximately the 
same, as for the choice abox,i/auni = 1 — one can 

match the amplitude of the linear theory growth func- 
tion in each box. For an fi„i = 1 cosmology with no 
dark energy (as in the simulations presented throughout 
this paper) it can be shown that the choice Obox.i/iuni = 
1 — (4/7)Ai meets this requirement to first order. Tests 
using this 4/7 factor, rather than the original 1/3, fur- 
ther increased the variance in ^(r) (with the the box 
size fixed). Since 4/7 is larger than 1/3, a re-derivation 
of the result of Eq. [501 only increases the variance ^(r) 
about the mean, thus working in the opposite direction 
of the desired effect. Another, less dramatic, modifica- 
tion to the ^-sampled method was made where instead 
of a linearized formula for Obox given Ouni and A as in 
Eq. [3 flbox was chosen by ensuring exactly that the age 
of the universe in a given box matched the age of the uni- 
verse at the epoch of interest in the "uni" cosmology, i.e., 
*box(abox) = tuni(auni)- Although Obox can deviate sub- 
stantially from auni, especially at late times, this change 
did not substantially affect the measured box-to-box vari- 
ance of ^(r). 

Another avenue to modifying the code to decrease the 
variance would appear to be to change the assumption 
that ^(r) = for r > Lbox/2 and instead allow ^(r) to 
be non-zero for a larger fraction of the box. In Q the 
choice of Lbox/2 was somewhat arbitrary and a closer in- 
spection reveals that the rms overdensity in spheres of 
volume equal to i^box deviates from the variance in the 
DC mode in simulation boxes with the Lbox/2 assump- 
tion at the ±10% level. However, no fixed assumption for 
the fraction of the box scale beyond which £^{r) — can 
bring these two statistics into perfect agreement and, for 
some powerlaws, correcting for this would imply setting 
^(r) = for r > Lbox (i.e. well beyond the box scale). 
For other powerlaws the comparison to the rms over- 
density implies that the DC variance is slightly under- 
predicted. Clearly the Lbox/2 assumption is close enough 
to the true variance that an explicit correction would not 



uniformly move the results in the direction of more re- 
alistic box-to-box variance on the quantities of interest. 
Furthermore, it can be shown that changing the fraction 
of the box scale can not fix the the problems that arise 
with extremely red initial power spectra or sufficiently 
small boxes (Lbox ^ 2.5h~^ Mpc) using ACDM initial 
conditions, which leads to negative values for Proai(fc) at 
certain k. 

Having exhausted these possibilities to improve the 
^-sampled method it would appear that one must sim- 
ply abandon the need to set the variance of the DC 
mode exactly as (Af) = L'rcai(0)/L^Q^ and instead use 
some diminished fraction of Preai(0)/L^Q^. Essentially, 
the only way to retain the advantages of the P-sampled 
and ^-sampled methods without dramatically increasing 
the number of simulations required is to allow the DC 
mode to to vary from simulation to simulation but not as 
broadly as the ^-sampled method would prescribe. Sim- 
ulations using this hybrid approach might see the sup- 
pression of P(fc) (Fig. [7]) and the measured variance in 
£_{r) (Fig. [TT]) near the box scale but not as strongly 
as before. So too the agreement between the expected 
variance in halo number counts and the measurements 
from simulation might improve but not to the point of 
complete reconciliation. This approach would involve 
certain subtleties regarding the correction of the statis- 
tics of the cosmological density field described in § III CI 
and Appendicies |X] & |B] In particular the correlation 
function measurement would include a correction from 
both the integral constraint and the (ouni/obox)'^ correc- 
tion from the ^-sampled method. To be sure, this hy- 
brid method would require somewhat more simulations 
to reach a given precision on the mean ^ (r) but only be- 
cause it is being used to correct for the suppression of the 
variance of ^(r) (Fig. [TT|) . A hybrid method along these 
lines would indeed have general applicability (for power- 
law ICs or ACDM), especially in cases when simulating 
more than one box to accumulate statistics is already a 
goal. 

Although this fundamental change to the variance of 
the DC mode might seem poorly motivated, bear in mind 
that the variance in ^(r) in the lower panels of Fig. [TUI 
was found to be dominated by the (ounit/abox)'^ correc- 
tion fator. If the variance of the DC mode was taken 
as a sufficiently small fraction of Preai(0)/Lj^Qj^, the vari- 
ance introduced by this correction would decrease and 
this could be used to mend the agreement with the gaus- 
sian statistics in Fig. 1111 Introducing this parameter to 
suppress the variance in the DC mode simply implies a 
continuum between ignoring the DC mode, as in conven- 
tional P-sampled simulations, and fully including it as 
in the original ^-sampled method. This would not be a 
free parameter in the sense that both the suppressed vari- 
ance of ^(r) in P-sampled simulations (Eq. 1241 including 
a cutoff at /cbox) and the excess variance in ^-sampled 
simulations (Eqs. are well understood so that the 

parameter would be uniquely set to reconcile as much as 
possible the the measured variance of ^(r) with the ex- 
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pectations for an infinite box. Though a very promising 
Une of inquiry, it is beyond the scope of this paper to fur- 
ther investigate this hybridization of the ^-sampled and 
P-sampled methods and I defer this to future work. 



IX. SUMMARY 

This paper presents extensive tests demonstrating the 
accuracy of running ensembles of cosmofogical iV-body 
simulations both using conventional P-sampled ICs, as 
well as so-called ^-sampled ICs, which was originally con- 
ceived by Pen Q and implemented by Sirko Q. No- 
tably, I found that the conventional P-sampled results 
presented in Q suffered from an integral constraint bias 
on the scale of the simulation box. Correcting for this 
measurement bias substantially improves the P-sampled 
^(r) predictions, increasing the amplitude of i^(r) by al- 
most a factor of two near the box scale, so that both 
^-sampled and P-sampled ensembles are unbiased esti- 
mators of the mean ^(r) and a key disadvantage to the 
P-sampled method is removed. 

In a suite of tests I used powerlaw initial conditions 
(with ^lrn = Ij^^de = 0) SO that the scale-free nature of 
these models, which gives rise to self-similar evolution 
of the cosmological density field, could be used to assess 
the accuracy of the simulations on different scales and 
to provide an absolute reference for when for when the 
predictions of ^-sampled and P-sampled ensembles dis- 
agree. Tests of this nature proved decisive in confirming 
the accuracy of the first gen eration of (P-sampled) cos- 
mological A^-body codes [221 . I focus on n = —1,-1.5 
and —2 powerlaws as well as a B AO-like powerlaw setup 
introduced by [1^ where powerlaws ofn = — 0.5,— 1 and 
— 1.5 are modulated by a gaussian bump feature in real 
space, resulting in sinusoidal oscillations around the pow- 
erlaws in fourier space. Additionally, mindful that the ^- 
sampled code has been used to generate mock catalogues 
, I investigated the halo mass functions and halo clus- 
tering bias in the pure powerlaw simulations, comparing 
to the results of [46[ . In an appendix I also compared the 
power spectrum measurements from the P-sampled sim- 
ulations to perturbation theory predictions for the sup- 
pression of the growth of structure caused by the finite 
box scale, as originally proposed by fisj . 

Generally, in the tests presented here using N = 256"^ 
particles and the Gadget2 code, the ^-sampled and P- 
sampled methods agreed on the mean £,{r), P{k), halo 
mass function and halo clustering bias, ^-sampled sim- 
ulations of the BAO-like setup also agreed well with the 
^(r) measurements from simulations presented in 23], 
indicating that the shift and broadening of the BAO fea- 
ture are accurately modeled. Insofar as the ^-sampled 
and P-sampled methods are equally valid approaches to 
making predictions for the non-linear growth of struc- 
ture, these results reassuringly imply that non-linearities 
are robustly predicted by both of these methods. A dif- 
ference, for example, between the P and ^-sampled pre- 



dictions for the BAO shift would have been very concern- 
ing. Another interesting conclusion, with similarly broad 
implications, is that the measured correlation function 
in both P-sampled and ^-sampled simulations with pure 
powerlaws matches the self-similar solution on scales be- 
low the initial mean interparticle spacing at late times, 
after the particles had moved significantly from their ini- 
tial positions. Although emphasizing the problems intro- 
duced by aggressive force softening such as that employed 
here, [41] find a similar trend in ACDM simulations and 
the results shown here extend that result to a more di- 
verse range of initial power spectra. 

Significant differences between the ^- and P-sampled 
methods emerged for the box-to-box variances of those 
quantities and the larger box-to-box variance of the ^- 
sampled results required significantly more simulations 
(> 2.5 times more; throughout I typically compare en- 
sembles of 20 P-sampled simulations to 50 or more ^- 
sampled simulations) to achieve a given precision on the 
mean ^(r) and other statistics. By comparing to the 
expected variance of ^(r) assuming gaussian fields and 
linear theory, the n = —1 results unambiguously showed 
that the variance of ^(r) significantly exceeds this ex- 
pectation so that the ^-sampled method is substantially 
sub-optimal for predicting the mean ^(r). By contrast 
the P-sampled method was shown to have a box-to-box 
variance in ^(r) that can be as much as a factor of two 
smaller than this expectation on scales approaching the 
box size. Cutting off the integral in the gaussian predic- 
tion at /cbox = 27r/ibox (Eq.[Ml) adequately modeled this 
important deficiency of the P-sampled method. 

Digging deeper into the source of the large variance 
of ^(r) in ^-sampled simulations, it was revealed that 
this variance comes predominantly from the (auni/abox)"^ 
correction term in the ^(7') measurements. This connects 
the number of ^-sampled simulations needed to reach a 
given precision on the mean ^(r) to the requirement that 
the mean overdensity in all the simulation boxes, (A^), 
is approximately zero and that (Af) « -PrcaKO)/^^^^' 
it must be in the iVsims — >■ 00 limit. These issues caused 
the measured mean ^(r) in rt = — 1 simulations to move 
well away from linear theory and the self-similar solution 
on scales approaching the box because of an insufficient 
number of simulations (iVsims = 50 jl. 

Having completed these tests the principal advantage 
of the ^-sampled method was in predicting the mean P{k) 
when the clustering becomes comparable to the size of 
the box. This is an important concern when simulat- 
ing extremely steep power spectra (riefi < —2) as is the 
case when investigating the "end" of the CDM hierar- 
chy with iV-body simulations [i^, H^j or, e.g., in sim- 
ulating small boxes to study galaxy formation in the 
early universe. As a further test of the two methods 



^ This problem does not affect the conclusions mentioned in the 
previous paragraph regarding the box-to-box variance of ^(r) in 
the n = —I case 
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in this regime I simulated an initially n = —2.25 power- 
law spectrum, and compared to the high resolution re- 
sults in [sl]. The results showed some evidence that the 
^-sampled method maintained self-similarity better than 
simulations using the P-sampled approach but overall did 
not prove steeply falling power spectra to be the "killer 
app" for the ^-sampled approach. This investigation also 
noticed that the shape of the ACDM correlation function 
is flat enough at small scales that for ibox ^ 2.5 ft-^^ 
Mpc the method will fail because Eq. [5] would imply 
-Ficai(fc) < for certain k values. These conclusions all 
pointed to the need to modify the ^-sampled approach 
for more general use. 

A number of modifications to the ^-sampled approach 
was discussed and investigated. Changinging the rela- 
tionship between the scale factor of interest in the box 
given the overdensity and the scale factor of interest in 
an asymptotically larger box only increased the variance 
of ^(r) and P(fc), contrary to the desired effect. Other 
modifications were ruled out such as changing the scale 
beyond which ^(r) = from Lbox/2 to some other frac- 
tion of the box size. 

A modification that should work, however, is to su- 
press the variance of the DC mode in a kind of hybrid 
approach where the simulations are run with less-than- 
prescribed scatter around the desired cosmology and the 
correlation function is measured with corrections arising 
from both the P-sampled and ^-sampled methods. The 
suppression of the DC mode can be straightforwardly re- 
lated to a decrease in the measured box-to-box variance 



of ^{r) until the gaussian expectation of the variance of 
^(r) is met. Since the suppression of the variance of ^(r) 
in a finite, P-sampled box is well understood the pa- 
rameter that suppresses the variance of the DC mode 
is uniquely defined. (The measurement of other statis- 
tics, e.g. P(fc) or halo bias, would need to be considered 
carefully as well.) On a basic level, introducing this sup- 
pression merely proposes a method in which there is a 
continuum from completely neglecting the DC mode, as 
in P-sampled simulations, and fully including it as in the 
^-sampled method. Such a method would be generally 
applicable (for powerlaws or ACDM ICs) and might alle- 
viate some of the known shortcomings of the conventional 
P-sampled approach 0, HJj . 
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Appendix A: The Integral Constraint on ^(r) in P-sampled Simulations 

An important but sometimes neglected subtlety in measuring the correlation function in standard, P-sampled 
simulations is the artificial imposition of an integral constraint on the measured £(r ) on the scale of the simulation 
volume. This effect is entirely orthogonal to the question of which estimator [13, IH, etc.] converges most rapidly to 
the true ^(r) in the presence of Poisson noise. The problem stems from using a finite volume with a fixed number 
density of particles n to estimate £,{r). Since, in a simulation box, the integral over the number density of particles 
separated by distance r (r(r) in Eq. [5]), over the simulation volume yields the number density of particles, 

/ r^T{r)dr = n, (Al) 

"'Vbox 

with ^ (r) in its original form in Eq. |S1 this implies 

/ r^U.Ur)dr - (A2) 

regardless of whether this is the case with the true ^(r). In finite volumes in the real universe n will vary from 
box-to-box and, importantly, the average of (which appears in Eq. will deviate from the inverse of the true 
(i.e. asymptotically large volume) number density. In surveys a similar problem is encountered from uncertainties 
and biases in the average number density of objects - in both cases the problem lies with the n term in Eq. [8] rather 
than the T{r) term. 

For ACDM simulations using large boxes (Pbox ^ 2h~^ Gpc) the integral constraint is a minor issue, but for 
simulations with smaller boxes this is an important concern. Notably, 8] presented simulations with Pbox = 50 — 
100/i~^ Mpc without any kind of corrections for this effect. As a remedy consistent with the more sophisticated 
treatments of ^] and [27l|, [23!, Appendix B] proposed that the correction should take the form 



£.{f) = £.raea.sir) -H^L(Pbox) 



(A3) 



23 



100 




100 



a 




15 20 25 

r (h^'' Mpc) 

FIG. 16. Matter correlation lunction results from a ACDM ensemble of simulations (Lbox = 100^~^ Mpc, A'' — 64'^, 100 
realizations) using standard (P-sampled) ICs. The left panel shows the ^(r) results from this simulation set without applying 
the integral constraint correction described in the text. The right panel shows the results from including the corrections in 
Eg. IA3I Error bars show the error on the mean. Note that the earliest output (a = 0.1, shown in blue) is severely affected by 
transients from the initial conditions. 



where ^meas(?') is as in Eq. [8] and ^L(^box) — jj^''"'' r^^i(r)dr where ^^(r) is the hnear theory correlation 

function and i?box = (47r/3)^^/'^Lbox = ibox/1-61. 

A simple derivation can be used to estimate the bias introduced by the integral constraint for large boxes assuming 
a ACDM initial power spectrum. In this case, 



C(^box) = , „3 / ■^'^r^^ACDM{r)dr = 



4^-Rbox Jo 4^i?3^^ 



'^■^■r'^£.KCDM{'r)dr - / '^T^r'^iKCDM{'r)dr 

Jflbox 



(A4) 



The integral over infinity is eqivalent to P{k 0) which goes to zero because P{k) ^ k on large scales. The other term 
within the brackets can be approximated analytically since on scales larger than r ~ 250/i~^ Mpc, £,acdm ~ Co(''o/^)'* 
where rg is a constant and the amplitude, is negative. It can be easily shown that 

e(i?box) « 3(-eo) = 20.16(-^o) f ■ (A5) 

Applying this result to estimate the fractional bias in the amplitude of the BAO feature yields 

e(rbao) - l(rbao) ^ g(^box) ^ ^ ^^^^ / 1^^:^\ ' ^^^^ 
UJ^bao) U»"bao) V -^box / 

where ^(rbao) is the uncorrected correlation function and I have assumed (— ^o)/C(''bao) ~ 2.71e-4 and tq ~ lh~^ 
Gpc, using CAMB [62] and parameters from WMAP7 [g^. Formally, because of a cancellation of the square of the 
linear theory growth function in the ratio (— Co)/C(''bao)j Eg. IA6I is independent of redshift and, if left unaccounted 
for, this measurement bias will propagate to change inferences regarding the broadening and shift of the BAO feature 
in the correlation function as well regardless of epoch. In more detail, redshift-dependent contributions arising from 
higher-order correlations can also bias the correlation function [56], so in practice Eq. lA6l can be thought of as a lower 
bound. 

Fig. [in] shows the importance of integral constraint with a set of P-sampled results from ACDM simulations 
{n,n ~ 0.27, flA = 0.73, h — 0.71, as = 0.84) that closely mimics the fiducial simulation set in Q (A^ = 64^ particles, 
ibox — lOOh^^ Mpc, 100 realizations with Zi = 39). Note that in this case the box size is small enough that the 
f(^box) correction must be evaluated numerically. 

Fig. [inican be directly compared to Fig. 9 of Although the first output is severely affected by transients from 
the initial conditions I include it for comparison with [8] . The integral-constraint correction increases the correlation 
function at l/4th scale of the box scale by almost a factor of two, considerably improving the agreement with linear 
theory. Without including this correction it is tempting to conclude incorrectly that the standard P-sampled method 
dramatically fails in modeling the correlation function unless box sizes much larger than the scales of interest are 
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used. I apply this correction to all of the P-sampled results shown in this paper, ^-sampled results do not require 
this correction, and Eq. [S] naturally takes this effect into account. In fact, the correction to ^(r) in Eg. IA3I bears a 
close resemblance to the form of Eq. [9l as I discuss in § IVII 



Appendix B: The Integral Constraint in Halo Bias Measurements 

As discussed in Appendix [XI correlation function measurements in P-sampled simulations suffer from an integral 
constraint imposed on the scale of the box. Since this measurement bias arises from incorrectly assuming that the 
density of pairs in the simulation volume is the same as the density of pairs as in an infinite volume, these conclusions 
extend to clustering measurements of any distribution of points in a finite volume (e.g. dark matter particles, halos, 
quasars). The purpose of this appendix is to examine this subtlety for the task of computing the halo clustering bias 
from a set of P-sampled simulations. 

For bias measurements I am interested in the quantity, 

1,2 _ ^hh{r) ^-^^^ 



where ^hhij") is the halo- halo correlation function and ^mm(r) is the matter- matter correlation function, elsewhere 
referred to in this paper as ^{r). Using the corrected values of Oift(^) ^'^d £,mmif) yields, 

imm{r) +^(Pbox) 

where £,hhir) are ^mmi'r) are uncorrected correlation function measurements. Moving the denominator of Eq. lB2l to 
the left hand side, 

(Lrn {r) + C"(i?box)) = ihh {r) + (B3) 

the 'f(Pbox) terms cancel, leaving just 

,2 ^hhir) 



^rnrn (^) 



(B4) 



It is unclear whether this subtlety was noticed in previous studies where bias measurements from pure powerlaw 
simulations using real-space statistics have been made [H, H^]. In any case, this appendix argues that the correction 
cancels, at least to first order. More sophisticated models exist for correcting for the integral constraint (e.g. [TtI) 
and obtaining unbiased estimators of the halo clustering bias. The bias results presented in Fig. [5] were determined 
using the simple formula of Eq. IB4I 



Appendix C: Modeling of the Effect of the Box Scale on the Growth of Structure in Simulations 

Seto [4^ provides arguments that the integral expressions for standard 1-loop perturbation theory (FT) (6^ can 
be used with a truncated initial power spectrum to capture the effect of the box scale on the growth of structure in 
cosmological N-body simulations. Formally, then, if 

{0 k < fcbox 

Afc" fcbox < fc < fee (CI) 
k> kc 

represents the initial power spectrum that is actually represented in the simulation, then the predicted non-linear 
power spectrum for k > fcbox is 

Pl-loop,trunc(fc) - PL{k) + P2,trunc(fc) (C2) 

where 



2,trunc 

(fc) — P22,trunc(A:) -|- Pl3,trunc(fc) 



(C3) 
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FIG. 17. A zoomed-in plot of small k/k\^ox results from P-sampled simulations with n — —1.5 (left panel) and n — —2 (right 
panel) originally presented in Fig. [T] Solid lines show an expectation for the deviations from self-similar behavior caused by 
the suppression of non- linear growth from the finite scale of the simulation box (Eg. ICll] inspired by [i^). Colors correspond 
to outputs at different epochs, as in Fig. [5] and [3 with blue corresponding to the first output in each case (a/a* — 0.038 for 
n = —1.5 and a/a, — 0.080 for n — —2) and orange corresponding to the last output (a/a, ~ 0.5 for both powerlaws). 
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[45| shows the ratio ^2,trunc(^)/^2(^) (i-e. with and without truncation) versus fc/fcbox for n = 1,0, —1 and —2 as an 
expectation for how much the box scale can change the non-linear growth. 

To compare with Fig.[71 although in principle Pi-ioop,trunc(fc)/^'ss(fc) = ^?-ioop.trunc(^)/^L(fc) is the correct quan- 
tity to display, since the large k predictions of 1-loop PT are typically not very accurate, it seems more meaningful to 
plot Pi-ioop,trunc(fc)/^'i-ioop(fc) which, at an early enough epoch, automatically converges to one (i.e. no modification 
of clustering caused by the box scale) since both numerator and denominator asymptote to Pl(A:). For the denomi- 



In terms of the dimensionless power spectrum. 



nator the analytic results from Appendix B of 65| are quite helpful, eliminating the need for numerical integration. 



65| (see also |66[) is of the form 



^l-loop(^) 



A? 



(fc) 



Alik) 



Ai(fc)(l 



■«nAi(fc)) 



(C6) 



where a„ is a constant and 



Ai(fc) = (fc/fcnl)"+' 



(C7) 



and 



A2(fc)=a„(fc/M''"+'^- 



(C8) 



Evaluating the expressions for a„ in [65| Appendix B gives a_i.5 = 0.33527 and a_2 — 1.56254. The n — —I 
powerlaw, which is UV divergent in Af_^^^p, is omitted from the results presented here since, even for quasi- reasonable 
assumptions for kc, the perturbation theory prediction for A'^^^oopik) so poorly resembles the true non-linear behavior 
that [45] found that the non-linear clustering is enhanced near the box scale (since P2,ti-unc{k) / P2{k) > 1 for all fc), 
rather than suppressed as seen in Fig. [7] or, e.g, [38l|. 
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For -3 < n < -1, then, 

^l-loop,trunc(^) _ A|(fc) + Al^j.^^^{k) 

ALioop(fc) " Ai(fc)+Al(fc) 

Ai(fc)/Ai(fc) + Al,,,„,(fc)/Ai(fc) 
1 + Ai(fc)/A2(fc) 

_ a-Hfc/fcni)-<"+^^ + A|,,„„,(fc)/Ai(fc) 

i + a;ri(fc/fc„i)-("+3) 



(C9) 
(CIO) 
(Cll) 



The term, A2 trunc(^)/A2(^)7 independent of epoch and is identical to the quantity plotted in 45] Fig. 3. Fig. [171 
shows the results from evaluating the numerator of this term numerically (as in ^M,) while using Eqs. IC8I for A2(fc). 
For reference, dashed black lines in Fig. [T7] show the maximally allowed clustering power, which corresponds to the 
^ni ^ limit where ^i—ioop,trunc 

(fc)/At,oop(fc)-^Ai,,_,(fc)/Ai(A:). 
Qualitatively, Eq. ICllI does bear some resemblance to the simulation results for k < lOfcbox- Note that for later 
outputs, in the deeply non-linear regime, one can hardl y e xpect perfect agreement and that the non-linear fitting 
function for n = — 2 is based on only one realization from [3a|, so the disagreement between Eq. lClll and. for example, 
the red (a/a, — 0.191) output may be due to poor constraints on A^j(fc) for fc/fcni <^ 1- (By contrast, the n = —1.5 
fitting function comes from 21 realizations 23] and is unlikely to be substantially different from what is assumed here.) 
With or without these caveats, clearly there is some promise in using FT schemes to anticipate how accurate a given 
simulation may be on scales approaching the box size. And, potentially, using one of the many other, more accurate 



FT schemes in the literature (see, e.g., 67[ and references therein) the comparison to the measured suppression of non- 
linear clustering power from simulation could easily be improved. These schemes should excel at robustly predicting 
P(fe) at low k. Typically, the concern in the literature is in the opposite regime of characterizing non-linearities at 
large k. 
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